Numerical evaluation of the Communication-Avoiding Lanczos algorithm
Abstract
The Lanczos algorithm is widely used for solving large sparse symmetric eigenvalue problems when only a few eigenvalues of the spectrum are needed. Because of its sparse matrix-vector multiplications and frequent synchronization, the algorithm is communication-intensive, which leads to poor performance on parallel computers and modern cache-based processors. The Communication-Avoiding Lanczos algorithm [Hoemmen; 2010] attempts to improve performance by taking the equivalent of s steps of the original algorithm at a time. The scheme is equivalent to the original algorithm in exact arithmetic, but as the value of s grows, numerical roundoff errors are expected to have a greater impact. In this paper, we investigate the numerical properties of the Communication-Avoiding Lanczos (CA-Lanczos) algorithm and how well it works in practical computations. Apart from the algorithm itself, we have implemented techniques that are commonly used with the Lanczos algorithm to improve its numerical performance, such as semi-orthogonal schemes and restarting. We present results showing that CA-Lanczos is often as accurate as the original algorithm. In many cases, if the parameters of the s-step basis are chosen appropriately, the numerical behaviour of CA-Lanczos is close to that of the standard algorithm, even though it is somewhat more sensitive to losing mutual orthogonality among the basis vectors.
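To make the loss of mutual orthogonality mentioned in the abstract concrete, here is a minimal sketch of the standard (one step at a time) Lanczos iteration in Python with NumPy. It is not the CA-Lanczos scheme and not the authors' implementation; the function names `lanczos` and `orthogonality_loss`, the test matrix, and the optional full reorthogonalization (used here only as a simple reference point, in place of the semi-orthogonal schemes the abstract mentions) are all assumptions made for illustration.

```python
import numpy as np


def lanczos(A, v0, m, reorthogonalize=False):
    """Run m steps of the standard Lanczos iteration on a symmetric matrix A.

    Returns the basis V (n x m) and the tridiagonal matrix T (m x m).
    Assumes no breakdown occurs (every off-diagonal entry stays nonzero).
    """
    n = v0.shape[0]
    V = np.zeros((n, m))
    alpha = np.zeros(m)        # diagonal of T
    beta = np.zeros(m - 1)     # off-diagonal of T
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = A @ V[:, j]                      # matrix-vector product
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]   # three-term recurrence
        alpha[j] = V[:, j] @ w               # inner product (a synchronization point)
        w -= alpha[j] * V[:, j]
        if reorthogonalize:
            # Full reorthogonalization against all previous basis vectors.
            w -= V[:, : j + 1] @ (V[:, : j + 1].T @ w)
        if j < m - 1:
            beta[j] = np.linalg.norm(w)      # norm (another synchronization point)
            V[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T


def orthogonality_loss(V):
    """Largest entry of |V^T V - I|; zero for a perfectly orthonormal basis."""
    m = V.shape[1]
    return np.max(np.abs(V.T @ V - np.eye(m)))


# Hypothetical test problem: a diagonal matrix with one well-separated eigenvalue.
rng = np.random.default_rng(0)
eigenvalues = np.concatenate([np.linspace(0.0, 1.0, 99), [2.0]])
A = np.diag(eigenvalues)
v0 = rng.standard_normal(100)

V_plain, T_plain = lanczos(A, v0, 30)
V_reorth, T_reorth = lanczos(A, v0, 30, reorthogonalize=True)

print("orthogonality loss, plain recurrence:  ", orthogonality_loss(V_plain))
print("orthogonality loss, full reorthogonal.:", orthogonality_loss(V_reorth))
print("largest Ritz value (reorthogonalized): ", np.linalg.eigvalsh(T_reorth).max())
```

The inner product and the norm in each step are the synchronization points that make the iteration communication-intensive on parallel machines; an s-step variant groups s such steps together, which is where the additional roundoff sensitivity discussed in the abstract comes from.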
Similar resources
A Communication-Avoiding Thick-Restart Lanczos Method on a Distributed-Memory System
The Thick-Restart Lanczos (TRLan) method is an effective method for solving large-scale Hermitian eigenvalue problems. On a modern computer, communication can dominate the solution time of TRLan. To enhance the performance of TRLan, we develop CA-TRLan that integrates communication-avoiding techniques into TRLan. To study the numerical stability and solution time of CA-TRLan, we conduct numeric...
Avoiding communication in the Lanczos bidiagonalization routine and associated Least Squares QR solver
Communication – the movement of data between levels of memory hierarchy or between processors over a network – is the most expensive operation in terms of both time and energy at all scales of computing. Achieving scalable performance in terms of time and energy thus requires a dramatic shift in the field of algorithmic design. Solvers for sparse linear algebra problems, ubiquitous throughout s...
Asymptotic Waveform Evaluation via a Lanczos Method
In this paper we show that the two-sided Lanczos procedure combined with implicit restarts offers significant advantages over the Padé approximations typically used for model reduction in circuit simulation. Keywords: Dynamical systems, Model reduction, Numerical methods, Padé approximation, Lanczos algorithm.
arXiv:hep-lat/9909131v1 17 Sep 1999
The Lanczos algorithm for matrix tridiagonalisation suffers from strong numerical instability in finite precision arithmetic when applied to evaluate matrix eigenvalues. The mechanism by which this instability arises is well documented in the literature. A recent application of the Lanczos algorithm proposed by Bai, Fahey and Golub allows quadrature evaluation of inner products of the form ψ · ...
The Improved Quasi-Minimal Residual
For the solution of linear systems of equations with unsymmetric coefficient matrices, we propose an improved version of the quasi-minimal residual (IQMR) method that uses the Lanczos process as a major component, combining elements of numerical stability and parallel algorithm design. For the Lanczos process, stability is obtained by a coupled two-term procedure that generates Lanczos vectors normali...